Optimum Instruction-level Parallelism (ILP) for Superscalar and VLIW Processors
نویسندگان
چکیده
Modern superscalar and VLIW processors fetch, decode, issue, execute, and retire multiple instructions per cycle. By taking advantage of instruction-level parallelism (ILP), processor performance can be improved substantially. However, increasing the level of ILP may eventually result in diminishing and negative returns due to control and data dependencies among subsequent instructions as well as resource con icts within a processor. Moreover, the additional ILP complexity can have signi cant overhead in cycle time and latency. This technical report uses a generic processor model to investigate the optimum level of ILP for superscalar and VLIW processors.
منابع مشابه
For Embedded Applications with Data-level Parallelism, a Vector Processor Offers High Performance at Low Power Consumption and Low Design Complexity. unlike Superscalar and Vliw Designs, a Vector Processor Is Scalable and Can Optimally Match Specific
Designers of embedded processors have typically optimized for low power consumption and low design complexity to minimize cost. Performance was a secondary consideration. Nowadays, many embedded systems (set-top boxes, game consoles, personal digital assistants, and cell phones) commonly perform computation-intensive media tasks such as video processing, speech transcoding, graphics, and high-b...
متن کاملA Brief Overview on Runtime-Aware Architectures
When uniprocessors were the norm, Instruction Level Parallelism (ILP) and Data Level Parallelism (DLP) were widely exploited to increase the number of instructions executed per cycle. The main hardware designs that were used to exploit ILP were superscalar and Very Long Instruction Word (VLIW) processors. The VLIW approach implies statically figuring out dependencies between instructions and sc...
متن کاملScalable Vector Processors for Embedded Systems
Designers of embedded processors have typically optimized for low power consumption and low design complexity to minimize cost. Performance was a secondary consideration. Nowadays, many embedded systems (set-top boxes, game consoles, personal digital assistants, and cell phones) commonly perform computation-intensive media tasks such as video processing, speech transcoding, graphics, and high-b...
متن کاملA Study of Out-of-Order Completion for the MIPS R10K Superscalar Processor
Instruction level parallelism (ILP) improves performance for VLIW, EPIC, and Superscalar processors. Out-of-order execution improves performance further. The advantage of out-of-order execution is not fully utilized due to in-order completion. In this report we study the performance loss due to in-order completion for MIPS R10000 processor.
متن کاملSoftware Pipelining and Superblock Scheduling: Compilation Techniques for VLIW Machines
© Copyright Hewlett-Packard Company 1992 Compilers for VLIW and superscalar processors have to expose instruction-level parallelism to effectively utilize the hardware. Software pipelining is a scheduling technique to overlap successive iterations of loops, while superblock scheduling extracts ILP from frequently executed traces. This paper describes an effort to employ both software pipelining...
متن کامل